[api][runtime] Introduce long-term memory in python #332
Conversation
Force-pushed from e85caea to 7c14985.
```python
SUMMARIZE = "summarize"


class ReduceSetup(BaseModel):
```
I'd suggest naming this CompactionStrategy and making it an abstract class that we can provide different implementations of, so we can have a strict limit on which arguments should be specified for each strategy. We can call the current ReduceStrategy CompactionStrategyType.
I think CompactionStrategy.trim(n) might be more straightforward for users, compared to ReduceSetup.trim_setup(n).
- +1 on that. "Compaction" is better terminology; this term is commonly used in many open-source projects in the industry today.
- Should we consider using nouns as the strategy names? For example, summarize -> summarization, trim -> truncation?
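For concreteness, here is a minimal sketch combining these suggestions (an abstract CompactionStrategy whose subclasses declare exactly their own arguments, plus a trim(n) factory); all names are illustrative, not the PR's final API:

```python
from abc import ABC, abstractmethod
from typing import List

from pydantic import BaseModel


class CompactionStrategy(BaseModel, ABC):
    """Abstract base; each concrete strategy declares exactly the
    arguments it needs, so invalid combinations are impossible."""

    @abstractmethod
    def compact(self, items: List[str]) -> List[str]:
        """Reduce the items to a smaller, denser set."""

    @staticmethod
    def trim(n: int) -> "TruncationStrategy":
        return TruncationStrategy(keep_last=n)


class TruncationStrategy(CompactionStrategy):
    keep_last: int

    def compact(self, items: List[str]) -> List[str]:
        # Truncation: keep only the most recent `keep_last` items.
        return items[-self.keep_last:]
```

Usage would then read `CompactionStrategy.trim(10)`, matching the `CompactionStrategy.trim(n)` call style proposed above.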
```python
if memory_set.size >= memory_set.capacity:
    # trigger reduce operation to manage memory set size.
    self._reduce(memory_set)
```
This can be extremely slow. We should proactively do the compaction.
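One way to do this proactively is to schedule compaction in the background once a lower watermark is crossed, instead of blocking add() at full capacity. A hypothetical sketch — the watermark, executor, and class shape are assumptions, and `_reduce` is the method from the diff above:

```python
from concurrent.futures import ThreadPoolExecutor


class ProactiveCompaction:
    # Start compacting at 80% of capacity so the add path never
    # has to block on a synchronous reduce at the capacity limit.
    WATERMARK = 0.8

    def __init__(self) -> None:
        self._executor = ThreadPoolExecutor(max_workers=1)

    def _maybe_schedule_compaction(self, memory_set) -> None:
        if memory_set.size >= memory_set.capacity * self.WATERMARK:
            # Run the potentially slow reduce in the background;
            # _reduce is provided by the memory implementation.
            self._executor.submit(self._reduce, memory_set)
```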
```python
        self.client.delete_collection(name=name)

    @override
    def add(self, memory_set: MemorySet, memory_item: str | ChatMessage) -> None:
```
I have a feeling that adding items to long-term memory can take time, due to embedding. We should probably also provide async APIs.
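A minimal sketch of an async variant, assuming the sync `add` from the diff above stays the primary implementation; the `aadd` name follows a common convention and is an assumption:

```python
import asyncio
from abc import ABC, abstractmethod


class BaseLongTermMemory(ABC):
    @abstractmethod
    def add(self, memory_set, memory_item) -> None:
        """Blocking add; the embedding call happens here."""

    async def aadd(self, memory_set, memory_item) -> None:
        # Offload the blocking embed + write to a worker thread so
        # the caller's event loop is not blocked by embedding latency.
        await asyncio.to_thread(self.add, memory_set, memory_item)
```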
```python
        return self.slice(memory_set=memory_set, offset=offset, n=n)

    @override
    def search(
```
Same here.
Hi, @alnzng. There's a design issue related to the vector store that I'd appreciate your help reviewing. As described in the design doc #339, long-term memory of flink-agents is also based on a vector store. Currently, I provide an implementation based on chroma. In this implementation, I directly use the chroma client rather than the flink-agents ChromaVectorStore.

@xintongsong believes that we can directly build upon the Flink-Agents vector store abstraction. I think this makes sense, but it requires adding some interfaces to it. These interfaces may not be achievable for every vector store; I will conduct research and refinement afterward.
Thanks for raising this design question @wenjin272! I agree with @xintongsong that building long-term memory on top of the existing vector store abstraction is the right direction. A few thoughts on the proposed methods:
However, I have concerns about including collection management in the base interface. This concept is vector-store specific rather than universal. My suggestion: keep collection management implementation specific rather than in the base interface. Each vector store integration can expose collection management using its native terminology and patterns. This approach aligns with how other frameworks, such as LangChain, handle it.
```sh
path=$1
chroma run --path $path
```
Is this script a must-have?
IIUC, you can just add `import chromadb` in the test class, which will "start" chroma automatically. I did this in test_chroma_vector_store.
IIUC, `import chromadb` will start chroma in in-memory mode. For long-term memory, we recommend server-client mode or cloud mode for data persistence, so I start a chroma server here.
To enable PersistentClient [1], specifying the persist_directory parameter while building the ChromaVectorStore should help. IIUC, this is similar to what the `--path $path` parameter offers in the chroma CLI.
[1] https://docs.trychroma.com/docs/run-chroma/persistent-client
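For reference, a minimal sketch of the persistent-client setup (the path and collection name here are arbitrary examples):

```python
import chromadb

# PersistentClient stores data on disk at the given path, so tests
# don't need a separate `chroma run` server process.
client = chromadb.PersistentClient(path="./chroma_data")
collection = client.get_or_create_collection(name="long_term_memory")
```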
Thanks for your suggestion @alnzng. On the reason I want to provide collection-level operations in the base interface: I have reviewed the vector store design in LangChain, and it indeed doesn't provide collection management in its base interface. But if we keep collection management implementation specific rather than in the base interface, the long-term memory layer cannot manage its collections through the common abstraction. Also, I think all the vector stores provide a concept like collection.
Maybe we can have two separate interfaces.
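A hypothetical sketch of this two-interface split — the class and method names are guesses, loosely modeled on the CRUD methods discussed in this PR:

```python
from abc import ABC, abstractmethod
from typing import Any, List


class VectorStore(ABC):
    """Base interface: document-level operations only."""

    @abstractmethod
    def add(self, documents: Any, collection_name: str | None = None) -> None: ...

    @abstractmethod
    def query(self, query: str, limit: int = 10, **kwargs: Any) -> List[Any]: ...


class CollectionManagedVectorStore(VectorStore):
    """Opt-in interface for stores that natively support collections."""

    @abstractmethod
    def create_collection(self, name: str, **kwargs: Any) -> None: ...

    @abstractmethod
    def delete_collection(self, name: str) -> bool: ...
```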
@wenjin272 This was my major concern, since I was not able to verify all existing vector stores, and we can't predict future vector stores. For example, Pinecone uses namespace + index to manage its data; a namespace seems like a "collection". However, it doesn't seem to support a namespace (metadata) update operation, which means the method below won't work with Pinecone: https://docs.pinecone.io/reference/api/2025-10/data-plane/createnamespace

@xintongsong's suggestion of adding a new interface for indicating collection support seems to be a safe solution.
Force-pushed from 94bee98 to c80d570.
Hi @alnzng, I have added some interfaces to the vector store abstraction as discussed above. The async execution of add and search is left for a follow-up.
| """ | ||
|
|
||
| @abstractmethod | ||
| def add_embedding( |
add_embedding and query_embedding are not meant for users to call. We probably should mark them as protected.
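In Python terms, "protected" would just be the leading-underscore convention; a small sketch of the rename (signatures are illustrative):

```python
from abc import ABC, abstractmethod
from typing import Any, List


class BaseVectorStore(ABC):
    # The underscore prefix signals these are hooks for store
    # implementations, not part of the user-facing API.
    @abstractmethod
    def _add_embedding(self, embedding: List[float], document: Any) -> None: ...

    @abstractmethod
    def _query_embedding(self, embedding: List[float], limit: int = 10) -> List[Any]: ...
```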
```python
class Collection(BaseModel):
    """Represents a collection of documents."""

    name: str
    size: int
```
What is this size used for and how do we keep it consistent?
```python
            The deleted collection
        """


def maybe_cast_to_list(value: Any | List[Any]) -> List[Any] | None:
```
Should this be private?
```python
class LongTermMemoryBackend(Enum):
    """Backend for Long-Term Memory."""

    VectorStore = "vectorstore"
```
Suggested change:
```diff
-    VectorStore = "vectorstore"
+    EXTERNAL_VECTOR_STORE = "external_vector_store"
```
```python
    item_type: Type[str] | Type[ChatMessage]
    capacity: int
    compaction_strategy: CompactionStrategy
    size: int = Field(default=0, exclude=True)
```
It's fragile to maintain the size separately from the underlying actual collection. We might consider getting the size from the store.
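A sketch of reading the size from the store on demand; the `count` method on the store is an assumption:

```python
class MemorySet:
    def __init__(self, name: str, store) -> None:
        self.name = name
        self._store = store

    @property
    def size(self) -> int:
        # Delegate to the backing store so the size can never
        # drift from the actual collection contents.
        return self._store.count(collection_name=self.name)
```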
```python
from flink_agents.api.prompts.prompt import Prompt


def summarize(
```
It doesn't make sense to always compact the entire long-term memory into one message. I think we should only merge similar messages and discard meaningless ones. As a result, we should get a smaller set of messages with higher information density.
It might require more effort than we can take on in this PR to tune the compaction strategy for ideal performance. But at least in the abstraction, we should not assume this returns only one message.
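At the abstraction level, the change suggested here might be as small as the return type; a sketch (types borrowed from the diff above, signature illustrative):

```python
from typing import List


def summarize(items: List["ChatMessage"], ctx: "RunnerContext") -> List["ChatMessage"]:
    """Compact `items` into a smaller list of messages.

    Returning a list (rather than a single message) leaves room for
    merging similar messages and discarding low-value ones, instead
    of collapsing the whole memory into one summary.
    """
    ...
```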
| """ | ||
|
|
||
| @abstractmethod | ||
| def delete_memory_set(self, name: str) -> MemorySet: |
It probably makes sense to just return a boolean. The memory set should no longer be bound to the underlying store.
```python
    )
    id: str | None = Field(
        default=None, description="Unique identifier of the document."
    )
```
Out of curiosity, did we change the code format style? It seems there are many diffs like this.
I just ran `ruff format` before committing the code. It looks like `ruff check` only checks the lint rules, while `ruff format` only formats the code style.
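For reference, the two commands do different jobs and are typically both run:

```sh
ruff check .    # lint rules only (use --fix to auto-fix)
ruff format .   # code formatting only
```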
| """ | ||
|
|
||
| def add( | ||
| self, documents: Document | List[Document], collection_name: str | None = None, **kwargs: Any |
The newly added CRUD methods look flexible enough to handle operations across multiple collections, which is good. However, the current vector store implementation is limited to a single collection. To keep the behavior consistent, we should update the existing implementation to support multiple collections as well. For example, the query implementation needs to be extended to support retrieval from multiple collections to align with the new CRUD methods.
```python
    )

    # Generate embeddings for each document
    embeddings = [embedding_model.embed(doc.content) for doc in documents]
```
Chroma seems to have a size limitation on each request when persisting embeddings: batches should contain fewer than 41,666 embeddings. This has been discussed in the Chroma community (see GitHub issue #1049). I'm not sure if this restriction has changed recently, but to be safe we should add a check for requests exceeding 41,666 embeddings, similar to how LlamaIndex handles it today: https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/vector_stores/llama-index-vector-stores-chroma/llama_index/vector_stores/chroma/base.py#L97C1-L97C72
That being said, it may be better to abstract the embedding function at the document level and allow each vector store implementation to override it as needed.
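A minimal sketch of the batching guard, modeled on the LlamaIndex approach linked above; the constant value mirrors the limit discussed, and the helper name is an assumption:

```python
MAX_CHROMA_BATCH_SIZE = 41665  # stay under Chroma's ~41,666 per-request limit


def _add_in_batches(collection, ids, embeddings, documents) -> None:
    # Split oversized requests into chunks Chroma will accept.
    for start in range(0, len(ids), MAX_CHROMA_BATCH_SIZE):
        end = start + MAX_CHROMA_BATCH_SIZE
        collection.add(
            ids=ids[start:end],
            embeddings=embeddings[start:end],
            documents=documents[start:end],
        )
```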
Commits:
- …de a vector store based implementation.
- address comments. fix
- …d provide a vector store based implementation. address comments.
Force-pushed from 9107487 to 8368345.
xintongsong left a comment:
LGTM.
@alnzng, would you like to take another look as well?
```python
response: ChatMessage = _generate_summarization(
    items, memory_set.item_type, strategy, ctx
)
```
The amount of memory to be compacted can be large and may exceed the context window of the model. We should take that into consideration. Let's add a TODO here and create a follow-up issue.
Linked issue: #331
Purpose of change
Introduces the long-term memory interface in Python, and provides an implementation based on Chroma.
This is the first PR of three to introduce long-term memory in Python.
Tests
Unit test
API
Yes, adds long-term-memory-related APIs.
Documentation
doc-needed